ACT  Research  Report  Series 


89-1 


AD-A207  854 

A  Comparison  of  the  Effects  of 
Random  Versus  Fixed  Order 
of  Item  Presentation  Via  the 
Computer 

Research  Report  01NIR89-1 

Terry  A.  Ackerman 
Judith  A.  Spray 
Mark  D.  Reckase 
James  E.  Carlson 


Prepared under Contract No. N00014-85-C-0241, Contract Authority
Identification No. NR 154-531, with the Cognitive Science Research Program of
the Office of Naval Research.

Approved for public release; distribution unlimited. Reproduction in whole or in
part is permitted for any purpose of the United States Government.


February  1989 



For  additional  copies  write: 
ACT  Research  Report  Series 
P.O.  Box  168 
Iowa  City,  Iowa  52243 


© 1989 by The American College Testing Program. All rights reserved.


REPORT DOCUMENTATION PAGE

Security classification of this page: Unclassified

1a.  Report Security Classification: Unclassified
3.   Distribution/Availability of Report: Approved for public release;
     distribution unlimited. Reproduction in whole or in part is permitted
     for any purpose of the U.S. Government.
4.   Performing Organization Report Number(s): ONR 89-1
6a.  Name of Performing Organization: ACT
6c.  Address (City, State, and ZIP Code): P.O. Box 168, Iowa City, IA 52243
7a.  Name of Monitoring Organization: Cognitive Science Research Programs,
     Office of Naval Research
7b.  Address (City, State, and ZIP Code): Code 1142CS, Arlington, VA 22217-5000
9.   Procurement Instrument Identification Number: N00014-85-C-0241
10.  Source of Funding Numbers: Program Element No. 61153N; Project No.
     RR04204; Task No. RR042041; Work Unit Accession No. NR154
11.  Title (Include Security Classification): A comparison of the effects of
     random versus fixed order of item presentation via the computer
12.  Personal Author(s): Terry A. Ackerman, Judith A. Spray, Mark D. Reckase,
     James E. Carlson
13a. Type of Report: Technical
14.  Date of Report (Year, Month, Day): 89/2/1
15.  Page Count: 37
18.  Subject Terms: Computerized testing
19.  Abstract: The effect of random versus fixed order of item presentation
     was studied using a computerized testing system at the Marine Corps
     Communication-Electronics School (MCCES) at the Twentynine Palms Marine
     Base in southern California. Classes from four different annexes were
     randomly divided between the two administrative formats. Similar results
     were found for each annex. The results suggest that when MCCES items are
     administered via the computer, order of item presentation makes at most
     a very small difference. Implications and future directions are
     discussed.
20.  Distribution/Availability of Abstract: Unclassified/Unlimited
21.  Abstract Security Classification: Unclassified
22a. Name of Responsible Individual: Dr. Charles Davis
22b. Telephone (Include Area Code): 202/696-4046
22c. Office Symbol: ONR 1142CS

DD Form 1473, 84 MAR. 83 APR edition may be used until exhausted. All other
editions are obsolete.


A  COMPARISON  OF  THE  EFFECTS  OF  RANDOM  VERSUS  FIXED  ORDER 
OF  ITEM  PRESENTATION  VIA  THE  COMPUTER 

Terry  A.  Ackerman 
Judith  A.  Spray 
Mark  D.  Reckase 
James  E.  Carlson 


Approved  for  public  release;  distribution  unlimited.  Reproduction  in  whole  or 
in  part  is  permitted  for  any  purpose  of  the  United  States  Government. 


ABSTRACT 

The  effect  of  random  versus  fixed  order  of  item  presentation  was  studied 
using  a  computerized  testing  system  at  the  Marine  Corps  Communication- 
Electronics  School  (MCCES)  at  the  Twentynine  Palms  Marine  Base  in  southern 
California.  Classes  from  four  different  annexes  were  randomly  divided  between 
the  two  administrative  formats.  Similar  results  were  found  for  each  annex. 

The  results  suggest  that  when  MCCES  items  are  administered  via  the  computer, 
order  of  item  presentation  makes  at  most  a  very  small  difference.  Implications 
and  future  directions  are  discussed. 




A  COMPARISON  OF  THE  EFFECTS  OF  RANDOM  VERSUS  FIXED  ORDER  OF  ITEM 
PRESENTATION  VIA  THE  COMPUTER 

Introduction 

As part of the research on Office of Naval Research Contract
N00014-85-C-0241, a computerized testing system was designed and installed at
the Marine Corps Communication-Electronics School (MCCES) at the Twentynine
Palms Marine Base in southern California. The hardware was designed as part of
a research project which had three phases. Phase I compared paper-and-pencil
and computer-administered modes of test administration. In Phase II, all
testing used the computer administration mode, and the effects of random versus
fixed item order were compared. The final phase consisted of implementing a
complete computerized adaptive testing system using item parameter estimates
calibrated from Phase I and Phase II response data.

Results from Phase I are described in full in Spray, Ackerman, Carlson, and
Reckase (1985). This report, which summarizes the results of Phase II,
parallels the Phase I report.

Method 

In Phase II, the effect of random versus fixed item order was studied in
four courses called "annexes": GR01, GR02, GR03 and GR05. For the purposes
of this study, classes within each of the four annexes were divided into two
groups according to the last four digits of each student's social security
number. If the number was odd, the student was assigned to a "fixed order"
group; if the number was even, the student was assigned to a "random order"
group. Both groups were given exactly the same items; however, each member of
the fixed group was presented the items in exactly the same fixed order,
whereas each member of the random group was presented the items in a different
random arrangement. All tests for each of the four annexes were 25 items in
length.
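
The assignment rule and the two presentation formats described above can be sketched as follows. This is only an illustration, not the MCCES testing software: the function names, the use of Python's `random` module, and the per-examinee seed are assumptions made for the example.

```python
import random

def assign_format(ssn_last4: str) -> str:
    """Odd last-four-digit number -> fixed order; even -> random order."""
    return "fixed" if int(ssn_last4) % 2 == 1 else "random"

def presentation_order(items, fmt, seed=None):
    """Return the order in which one examinee sees the test items.

    Fixed-order examinees all see the items in the stored order;
    random-order examinees each get their own shuffle of the same items.
    """
    order = list(items)
    if fmt == "random":
        # The seed is a hypothetical per-examinee value, used only to make
        # the example reproducible.
        random.Random(seed).shuffle(order)
    return order

test_items = list(range(1, 26))      # a 25-item test
fmt = assign_format("4821")          # hypothetical examinee
order = presentation_order(test_items, fmt, seed=4821)
```

Both groups receive the same set of items; only the sequence differs, which is the single factor the study varies.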

During the period of this study, 13 classes were tested in GR01, 12
classes in GR02, 14 classes in GR03, and 15 classes in GR05. The numbers of
students who were administered items in a fixed order were 131, 143, 127 and
87 for annexes GR01, GR02, GR03 and GR05, respectively. The numbers who
received items in a random order were 138, 123, 137 and 108, respectively.

The item pool for GR01 was the largest, containing 83 items. The item pools
for the remaining annexes contained 75 items each. Items were pseudo-randomly
selected without replacement to make up the 25-item tests. Thus, each of the
25-item tests consisted of different items, and no item was allowed to repeat
until every third class.
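
One reading of this selection scheme is sketched below: each class's test is drawn without replacement from the items not yet used, and the pool is refilled only when fewer than 25 unused items remain. The function name, the seed, and the refill rule are assumptions; the report does not describe the actual selection program.

```python
import random

def draw_tests(pool_size, test_length=25, n_classes=6, seed=0):
    """Draw one test per class without replacement from a shrinking pool.

    When fewer than `test_length` unused items remain, the pool is refilled,
    which is the point at which items can begin to repeat.
    """
    rng = random.Random(seed)
    unused = list(range(1, pool_size + 1))
    tests = []
    for _ in range(n_classes):
        if len(unused) < test_length:
            unused = list(range(1, pool_size + 1))   # start a new cycle
        test = rng.sample(unused, test_length)
        drawn = set(test)
        unused = [item for item in unused if item not in drawn]
        tests.append(sorted(test))
    return tests

tests = draw_tests(pool_size=75)
```

With a 75-item pool, three consecutive 25-item tests exhaust the pool, so under this reading an item can reappear no sooner than the fourth class.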

A  typical  testing  session  was  as  follows: 

1.  A  student  was  randomly  assigned  to  either  a  fixed  or  random  item 
presentation  format  by  the  computer  after  logging  on. 

2.  Students  would  respond  to  items  using  a  series  of  training  manuals 
containing  schematics  and  other  pertinent  wiring  diagrams.  Sometimes 
students  would  have  to  refer  to  several  manuals  to  arrive  at  the  answer. 

3.  Students  could  also  use  scrap  paper  for  simple  calculations  as  well 
as  any  class  notes  that  they  may  have  written  into  their  manuals. 

4.  Each  testing  session  would  last  125  minutes  and  was  monitored  by  at 
least  one  instructor. 

5.  If  students  completed  the  test  early,  they  could  sign  off  their 
computer  and  leave  the  room. 

6.  A  review  of  the  test  and  results  was  conducted  immediately  following 
the  testing  session. 

Results

Data collection from the equivalent groups started in November of 1987
and concluded in June of 1988. Based on the analysis of these data, it was
concluded that although some statistically significant differences existed, no
differences of practical importance between the two administrative formats
were detected.

Total test score comparisons were made between testing formats for each
of the four annexes. Figures 1, 2, 3 and 4 display the 95% confidence
intervals about the mean total test score for the fixed-order and random-order
groups for each class in each annex. In all cases except two, the confidence
bands overlap, indicating a nonsignificant difference between the mean test
scores for the two formats. Only for the third and thirteenth GR03 classes do
the confidence bands not overlap. These exceptions can be partly explained by
the small sample, N = 3, for the random-order group in each of these classes.
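
The overlap check can be illustrated with the summary statistics for the third GR03 class in Table 3. The report does not say how the intervals in the figures were computed, so this sketch assumes a normal-approximation interval (mean plus or minus z times SD over the square root of N); the function names are hypothetical. A t-based interval would be wider for a group of three examinees and could change the verdict.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a group mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for 95%
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width

def overlap(a, b):
    """True if intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Third GR03 class (Table 3): fixed N = 12, mean 20.0, SD 1.9;
# random N = 3, mean 17.7, SD .6
fixed_ci = mean_ci(20.0, 1.9, 12)
random_ci = mean_ci(17.7, 0.6, 3)
bands_overlap = overlap(fixed_ci, random_ci)
```

Under this assumption the two bands for that class do not overlap, which agrees with the exception noted in the text.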

Mean scores and standard deviations for each class in each annex are
shown in Tables 1, 2, 3 and 4. A two-way analysis of variance was performed
for each annex to determine whether any class or administrative format effect
was present.

Between-class differences yielded ANOVA F-statistics and p-values of
F(12,110) = 1.82, p = .0540; F(11,162) = 21.95, p < .0000; F(13,112) = 3.58,
p < .0001; and F(11,91) = 1.67, p = .0925 for the four respective annexes.
Between-format differences were F(1,114) = .10, p = .7547; F(1,163) = 1.03,
p = .3117; F(1,121) = 1.97, p = .1626; and F(1,94) = 3.98, p = .0490,
respectively. All tests of the interaction between class and administrative
format were nonsignificant.


Annex GR05 was the only annex to yield a difference in mean scores that
was significant at the .05 level. This was also the annex on which the
students scored the highest, possibly resulting in some restriction of range
and, in turn, smaller standard deviations for the scores.

Tests between the empirical cumulative raw score distributions were also
computed. The cumulative distributions are graphically displayed in Figures
5, 6, 7, and 8, respectively.

A two-sided Kolmogorov-Smirnov test of no difference between the cumulative
distributions was computed for each annex. The test statistics, T, and the
associated p-values for GR01, GR02, GR03 and GR05 were T = .519, p = .950;
T = .774, p = .586; T = .631, p = .821; and T = 1.028, p = .241, respectively.
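
The reported p-values are consistent with treating T as the scaled two-sample statistic, T = sqrt(nm/(n+m)) times the maximum distance D between the two empirical distributions, and applying the asymptotic Kolmogorov distribution. That reading is an inference from the numbers, not something the report states, and the function name below is hypothetical.

```python
import math

def kolmogorov_pvalue(t, terms=100):
    """Asymptotic two-sided p-value for a scaled Kolmogorov-Smirnov statistic t.

    Uses the series 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * t^2).
    """
    if t <= 0:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * t) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

reported_t = {"GR01": 0.519, "GR02": 0.774, "GR03": 0.631, "GR05": 1.028}
p_values = {annex: kolmogorov_pvalue(t) for annex, t in reported_t.items()}
```

Each computed value agrees with the corresponding reported p-value to within about .001.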

The fact that the Kolmogorov-Smirnov test has slightly less power than the
ANOVA may explain the differences in the results. Taken together, however, the
two analyses show that any differences are difficult to interpret.

Frequency distributions of the proportion-correct values were similar for
both administrative formats across the four annexes. These results are displayed
in Table 5. Note that frequencies are also shown for items to which at least
50 examinees responded.

Stepwise logistic regression analyses were performed on each item within
each annex to test for administrative format differences in item discrimination
and item difficulty. The process is described in detail in the Appendix. The
results of the item logistic regression analyses are reported by annex in
Tables 6, 7, 8 and 9. In all, 308 items were tested. Of these, seven items had
difficulty (format) differences at a probability less than .01. These items
were numbers 60 and 68 from GR01; 13, 23, 41 and 50 from GR03; and 43 from
GR05. Discrimination (format X score) differences for five items were
determined to be significant at the same level: numbers 30 and 44 from GR01;
8 and 30 from GR02; and 60 from GR03. The position of each of these items in
both the fixed and random order formats was examined. No consistent pattern of
a shift in difficulty or discrimination was found to coincide with a shift in
item location.
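
Tables 6 through 9 report p-values for the "improvement" chi-square obtained when the format term, and then the format-by-score term, enters the model. The Appendix gives the actual procedure; the sketch below only assumes that each improvement statistic is a likelihood-ratio chi-square on one degree of freedom. The function name and the log-likelihood values are hypothetical and are not taken from the report.

```python
import math

def improvement_pvalue(loglik_reduced, loglik_full):
    """p-value for adding one term to a logistic model, treating
    2 * (gain in log-likelihood) as chi-square with 1 degree of freedom."""
    chi2 = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical log-likelihoods for one item, before and after adding
# the format term
p_format = improvement_pvalue(loglik_reduced=-40.00, loglik_full=-36.68)
```

A format term whose improvement p-value falls below .01 corresponds to a difficulty difference in the sense used above; the same test applied to the format-by-score term corresponds to a discrimination difference.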

For the majority of items, the power to detect difficulty or discrimination
differences may have been extremely small because of the small sample sizes.

Also, because of the criterion-referenced nature of the test, most of the items
tended to have high proportion-correct values, thus making item difficulty
or discrimination parameters more homogeneous.

Summary 

The results of this eight-month research study suggest that when the GRRC
items are administered via the computer, order of item presentation makes at
most a very small difference. This would imply a lack of dependence between
items. That is, the response to a given item is not affected by responses to
any previous items, nor will it affect the response to any subsequent items.

Items  which  were  found  to  have  either  statistically  significant  difficulty 
or  discrimination  differences  were  reviewed  for  possible  clues  as  to  why  the 
differences  exist.  An  analysis  of  the  surface  features  of  the  items  revealed  no 
clue  to  the  cause  of  the  differences,  if  indeed  the  differences  are  real.  Given 
that  three  significant  values  would  be  expected  by  chance,  the  fact  that  there 
were  only  seven  and  five  significant  results  found  implies  that  any  effects  are 
very  weak  indeed.  However,  a  detailed  analysis  of  the  items  by  content  experts, 
or  a  thorough  questioning  of  the  students,  might  reveal  some  explanation  for  the 
minor  differences  that  were  detected.  Such  analyses  were  not  completed. 
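
The chance expectation cited above follows from 308 tests at the .01 level. The sketch below reproduces that arithmetic and, under the added assumption that the 308 item tests are independent (an idealization the report does not claim), computes binomial tail probabilities for the observed counts. The function name is hypothetical.

```python
from math import comb

def binomial_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

n_tests, alpha = 308, 0.01
expected_by_chance = n_tests * alpha                    # 3.08, about three items
tail_difficulty = binomial_tail(n_tests, alpha, 7)      # seven difficulty results
tail_discrimination = binomial_tail(n_tests, alpha, 5)  # five discrimination results
```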

TABLE 1

Test Score Summary Statistics by Class - GR01

                Fixed Order              Random Order
Class        N     Mean     SD        N     Mean     SD
  1          8     19.8    3.2        9     19.2    1.9
  2          5     19.0    3.4       11     20.3    2.1
  3         11     20.8    2.5       11     20.3    3.0
  4         11     19.4    3.2       14     20.1    1.7
  5          9     21.6    1.9       14     21.5    1.7
  6         11     20.5    2.5       11     20.0    2.6
  7         11     19.6    1.7       13     20.8    2.0
  8         11     20.8    2.0       11     20.5    2.2
  9          6     21.8    1.5       12     20.2    3.1
 10         13     20.0    2.1        7     19.4     .8
 11         13     19.6    2.8       10     21.0    2.1
 12         13     21.3    1.8        6     22.0    1.9
 13          9     20.4    2.5        9     18.2    3.8
Overall    131     20.4    2.4      138     20.5    2.9

TABLE 2

Test Score Summary Statistics by Class - GR02

                Fixed Order              Random Order
Class        N     Mean     SD        N     Mean     SD
  1         12     16.7    2.1        8     14.1    2.9
  2          9     21.7    2.2       13     19.8    2.6
  3         10     21.5    1.6       11     21.7    2.0
  4          9     21.1    3.1       14     21.9    2.3
  5         12     20.7    3.9       10     20.5    2.5
  6          8     14.9    2.0       15     15.6    2.2
  7         13     17.2    1.7        8     16.8    2.6
  8         15     16.1    1.6       10     16.7    1.9
  9         12     16.7    2.1        8     14.1    2.9
 10         13     17.2    1.7        8     16.8    2.6
 11         12     19.2    3.4        6     21.0    1.7
 12         18     18.1    2.7       12     18.0    2.6
Overall    143     18.3    3.1      123     18.3    3.5

TABLE 3

Test Score Summary Statistics by Class - GR03

                Fixed Order              Random Order
Class        N     Mean     SD        N     Mean     SD
  1          3     20.2    3.7        7     20.3    2.1
  2          8     16.3    2.2        8     16.5    3.5
  3         12     20.0    1.9        3     17.7     .6
  4          8     19.0    3.2       11     18.3    2.8
  5         10     19.7    2.9       13     18.1    2.5
  6          9     19.3    3.1       15     18.5    3.7
  7          8     17.9    2.6       15     17.5    3.1
  8         14     17.1    4.0       11     17.7    4.3
  9          6     18.3    2.8       13     17.2    2.0
 10         10     21.1    2.2        9     19.2    3.7
 11          5     20.2    3.7        7     20.3    2.1
 12          8     16.3    2.2        8     16.5    3.5
 13         12     19.7    1.9        3     17.7     .6
 14         12     18.4    2.5       12     20.3    2.4
Overall    127     18.8    3.0      137     18.3    3.1

TABLE 4

Test Score Summary Statistics by Class - GR05

                Fixed Order              Random Order
Class        N     Mean     SD        N     Mean     SD
  1          7     20.4    2.2        6     20.7    3.1
  2          6     21.5    1.9        7     20.7     .8
  3          6     20.8    3.2        5     20.8    2.4
  4          8     22.6    1.8       14     22.1    2.2
  5          5     21.8    1.6       12     19.8    3.2
  6          7     20.3    3.4        5     19.6    1.1
  7          6     21.0    1.3       14     18.7    2.4
  8          4     21.8    2.6       13     19.9    3.7
  9         10     21.3    2.5        7     21.9    2.0
 10          6     21.5    1.6        8     22.3    1.8
 11         10     22.4    2.5        6     20.7    2.3
 12         12     20.3    2.1       11     20.3    2.4
Overall     87     21.3    2.3      108     20.5    2.7


TABLE 5

Frequency Distributions of Proportion-Correct Values

Note: Frequencies in parentheses represent examinee samples > 50.

TABLE 6

Logistic Regression Results of GR01

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
   1     11    11    1.00    1.00         1.000               1.000
   2     35    33     .83     .94          .285                .955
   4     21    19     .52     .68          .297                .265
   5     15    21    1.00    1.00         1.000               1.000
   6     11    11     .91    1.00          .251                .995
   7     13     7     .92     .86          .549                .466
   8     11    11     .91     .91          .924                .055
   9     14    25    1.00    1.00         1.000               1.000
  10     39    45     .97     .96          .622                .727
  11     51    48     .96     .96          .961                .207
  12     53    64    1.00     .98          .315                .989
  13     46    41     .89     .85          .578                .748
  14     43    43     .88     .84          .540                .292
  15     46    43     .72     .63          .318                .637
  16     53    64    1.00     .94          .052                .974
  17     35    35    1.00    1.00         1.000               1.000
  18     38    42     .74     .71          .821                .970
  19     31    33     .97    1.00          .223                .987
  20     50    53     .84     .85          .845                .423
  21     35    29     .69     .55          .303                .419
  22     44    61     .80     .80          .975                .183
  23     46    53     .83     .81          .815                .205
  24     56    47     .80     .81          .931                .964
  25     43    41     .91    1.00          .018                .984


Table 6, cont.

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
  26     27    32     .89     .91          .814                .601
  27     53    58     .92     .90          .566                .678
  28     30    31     .97    1.00          .213                .988
  29     61    67     .74     .70          .679                .230
  30     48    47     .81     .89          .066                .000
  31     23    34     .96     .85          .208                .046
  32     54    50     .98     .98          .988                .435
  33     54    54     .96     .96          .996                .235
  34     29    29    1.00     .90          .037                .995
  35     50    56     .98     .96          .612                .472
  36     52    53    1.00    1.00         1.000               1.000
  37     54    55     .98     .98          .968                .445
  38     39    43     .97     .95          .628                .723
  39     39    40     .80     .75          .978                .499
  40     63    67     .95     .97          .593                .554
  41     44    43     .98     .98          .963                .469
  42     25    33     .92     .97          .349                .923
  43     41    41     .51     .49          .803                .557
  44     48    47  [illeg.]   .34          .740                .004
  45     41    45    1.00    1.00         1.000               1.000
  46     26    40    1.00    1.00         1.000               1.000
  47     56    44     .98    1.00          .307                .989
  48     49    54     .90     .93          .622                .789
  49     33    30     .97     .93          .949                .073
  50     35    32     .94    1.00          .106                .999


Table 6, cont.

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
  51     13     6    1.00    1.00         1.000               1.000
  52     15    26    1.00     .96          .338                .999
  53     16    22     .88     .91          .740                .409
  54     19    22     .68     .82          .335                .178
  55     33    31    1.00    1.00         1.000               1.000
  56     51    56     .55     .75          .029                .379
  57     53    46     .32     .26          .536                .976
  58     28    32     .64     .78          .128               1.000
  59     53    64     .57     .50          .563                .549
  60     74    72     .93    1.00          .005                .985
  61     49    48     .94     .88          .294                .545
  62     24    21     .50     .71          .155                .906
  63     50    53     .62     .57          .859                .078
  64     42    37     .69     .49          .028                .384
  65     44    59     .50     .48          .859                .463
  66     27    35     .74     .91          .076                .905
  67     53    49     .81     .88          .414                .777
  68     20    25     .70     .28          .004                .951
  69     25    34     .36     .59          .068                .420
  70     59    52     .75     .69          .453                .612
  71     53    55     .59     .64          .503                .460
  72     52    55     .56     .53          .691                .603
  73     38    40     .82     .73          .341                .480
  74     38    44     .74     .61          .189                .103
  75     61    64     .80     .83          .692                .906




TABLE 7

Logistic Regression Results of GR02

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
   1     30    18     .87     .94          .390                .013
   2     34    31     .00     .03          .252                .994
   3     39    26     .82     .65          .685                .063
   4     47    41     .00     .00         1.000               1.000
   5     30    18     .77     .78          .991                .348
   6     26    16    1.00    1.00         1.000               1.000
   7     58    47     .03     .06          .484                .795
   8     15    10     .33     .20          .434                .004
   9     30    18     .90     .90          .802                .163
  10     63    53     .97     .98          .643                .635
  11     42    43     .74     .65          .349                .799
  12     22    17     .77     .82          .910                .549
  13     70    60     .94     .95          .775                .516
  14     78    75     .73     .71          .748                .651
  15     80    81     .68     .57          .176                .138
  16     26    16     .31     .38          .590                .049
  17     44    37     .00     .08          .051                .992
  18     33    22     .67     .77          .397                .950
  19     74    79     .60     .61          .866                .950
  20     55    46     .87     .80          .369                .110
  21     73    70     .90     .94          .470                .439
  22     44    48     .89     .81          .306                .055
  23     71    56     .89     .96          .082                .090
  24     40    29     .88     .97          .202                .748
  25     63    73     .92     .90          .726                .726


Table 7, cont.

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
  26     74    65  [illeg.]   .82          .232                .860
  27     46    33     .89     .73          .096                .623
  28     72    53     .90     .89          .148                .742
  29     38    48     .55     .92          .568                .010
  30     33    22     .76     .86          .339                .001
  31     30    18     .93     .94          .936                .312
  32     44    48     .89     .81          .325                .297
  33     69    57     .71     .67          .628                .210
  34     57    52    1.00    1.00         1.000               1.000
  35     70    66     .93     .94          .864                .867
  36     56    53     .98    1.00          .299                .974
  37     34    27     .94     .96          .543                .089
  38     68    59     .99     .92          .061                .044
  39     41    45     .93     .93          .988                .153
  40     15    10     .93    1.00          .096                .931
  41     30    18     .80     .94          .159                .192
  42     64    53     .78     .70          .315                .403
  43     69    67     .88     .84          .428                .977
  44     74    79     .95     .90          .271                .095
  45     12     6     .08     .60          .891                .947
  46     44    28     .07     .07          .972                .012
  47     33    22     .21     .18          .792                .515
  48     61    60     .98     .97          .469                .463
  49     55    47     .71     .66          .549                .043
  50     70    66     .81     .82          .751                .458
  51     62    64     .76     .64          .170                .433
  52     50    32     .08     .00          .015                .983
  53     59    51     .19     .22          .272                .926
  54     33    22     .00     .00         1.000               1.000
  55     34    31     .79     .68          .518                .974


Table 7, cont.

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
  56     89    81     .80     .64          .045                .803
  57     39    26     .00     .00         1.000               1.000
  58     65    52     .86     .83          .440                .821
  59     26    16     .39     .50          .350                .001
  60     55    58     .86     .86          .992                .154
  61     38    33     .76     .94          .016                .618
  62     39    26     .74     .65          .442                .365
  63     50    32     .76     .69          .867                .894
  64     47    47     .77     .70          .436                .619
  65     39    26     .90     .96          .297                .239
  66     44    28     .52     .61          .466                .845
  67     36    22    1.00     .91          .162                .986
  68     34    31     .71     .87          .105                .187
  69     63    73     .75     .75          .924                .776
  70     39    26     .82     .69          .133                .121
  71     61    60     .90     .83          .128                .054
  72     53    62     .87     .90          .602                .281
  73     26    16     .85    1.00          .037                .996
  74     12     6     .83    1.00          .231                .998
  75     36    22     .58     .68          .076                .151


TABLE 8

Logistic Regression Results of GR03

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
   1     41    51     .98    1.00          .204                .987
   2     48    43     .98     .98          .829                .089
   3     38    41     .84     .85          .900                .500
   4     34    19     .32     .37          .255                .056
   5     44    52     .96     .96          .980                .731
   6     49    64     .65     .77          .152                .327
   7     48    43    1.00    1.00         1.000               1.000
   8     45    52    1.00     .96          .096                .990
   9     34    40     .53     .50          .967                .864
  10     40    53     .63     .59          .792                .728
  11     40    50     .75     .84          .276                .969
  12     47    32     .92     .75          .064                .695
  13     68    53     .88     .64          .001                .361
  14     28    38     .71     .63          .564                .429
  15     31    44    1.00     .89          .019                .994
  16     41    51     .68     .59          .337                .361
  17     44    28     .98    1.00          .204                .986
  18     42    56     .43     .39          .785                .461
  19     44    49     .64     .78          .063                .107
  20     45    52     .76     .85          .383                .022
  21     43    56     .35     .43          .451                .204
  22     31    41     .55     .59          .988                .198
  23     40    40     .58     .85          .003                .943
  24     52    39     .94     .85          .479                .763
  25     48    51     .77     .65          .175                .618


Table 8, cont.

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
  26     32    40     .69     .76          .432                .123
  27     50    41     .96     .93          .602                .996
  28     26    29     .69     .66          .772                .411
  29     50    62     .76     .77          .610                .018
  30     58    44     .86     .86          .876                .755
  31     40    51     .88     .75          .059                .937
  32     41    32     .68     .72          .514                .346
  33     52    44     .31     .52          .027                .406
  34     36    49     .69     .80          .151                .964
  35     36    54     .42     .52          .342                .098
  36     33    40     .55     .53          .757                .552
  37     53    44     .91     .91          .794                .970
  38     38    34    1.00     .97          .469                .982
  39     41    53     .73     .55          .101                .827
  40     53    63     .81     .86          .613                .664
  41     36    38     .94     .63          .000                .403
  42     38    34     .76     .77          .721                .064
  43     35    51     .86     .71          .222                .567
  44     41    32     .56     .53          .835                .239
  45     51    46     .94     .85          .222                .569
  46     54    61     .80     .78          .686                .035
  47     58    65     .79     .71          .193                .468
  48     40    53     .88     .70          .055                .405
  49     36    38     .86     .63          .024                .823
  50     40    40     .98     .78          .004                .018


Table 8, cont.

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
  51     37    54     .97     .94          .503                .293
  52     64    52     .41     .29          .314                .364
  53     20    27     .90     .96          .330                .961
  54     32    21     .91     .67          .065                .399
  55     66    50     .33     .46          .095                .270
  56     34    19     .91    1.00          .031                .990
  57     51    46     .69     .76          .467                .853
  58     33    42     .49     .41          .479                .235
  59     58    65     .78     .72          .352                .950
  60     19    29     .21     .24          .849                .002
  61     35    51     .80     .90          .110                .237
  62     49    51     .49     .59          .227                .287
  63     42    49     .62     .59          .845                .473
  64     37    49     .89     .86          .823                .586
  65     34    51     .88     .92          .274                .454
  66     44    54     .68     .70          .804                .342
  67     54    43     .87     .88          .772                .192
  68     44    47     .71     .66          .743                .871
  69     44    33     .68     .79          .191                .078
  70     26    29     .54     .79          .055                .775
  71     52    44     .90     .89          .933                .562
  72     30    44     .70     .66          .719                .858
  73     52    65     .63     .59          .588                .610
  74     38    52     .90     .85          .497                .707
  75     47    32     .98     .97          .766                .745


TABLE 9

Logistic Regression Results of GR05

                     Proportion       Improvement χ² p-values
                      Correct       Format Effect    Format by Score Effect
Item    N_f   N_r       f       r    (Difficulty)       (Discrimination)
   1     25    47     .96     .96          .968                .768
   2     41    49     .95     .92          .548                .171
   3     35    30     .94     .90          .506                .440
   4     32    43     .88     .77          .210                .223
   5     12    13    1.00     .92          .146                .986
   6     29    34     .97     .97          .866                .110
   7     33    47     .94     .94          .752                .858
   8     48    56     .92     .80          .096                .593
   9     31    33     .90     .79          .147                .783
  10     21    28     .95    1.00          .110                .970
  11     34    37     .85     .81          .518                .414
  12     16    32     .69     .84          .134                .111
  13     22    25     .91     .84          .549                .730
  14     35    43     .57     .98          .750                .849
  15     28    42     .96     .88          .239                .442
  16     21    25     .71     .48          .153                .400
  17     31    31     .52     .61          .376                .869
  18     28    33    1.00     .97          .365                .990
  19     17    33     .88     .91          .740                .383
  20     28    33    1.00     .94          .194                .990
  21     28    30     .39     .33          .869                .849
  22     48    56     .83     .84          .952                .667
  23     36    46     .97     .96          .920                .217
  24     10    18    1.00    1.00         1.000               1.000
  25     43    42     .81     .88          .419                .847


Table 9, cont.

                                         Improvement χ² p-values
                  Proportion Correct                    Format by
                                       Format Effect    Score Effect
Item #   N_f   N_r         f         r  (Difficulty) (Discrimination)

    26    35    52       .63       .44          .139             .760
    27    16    14      1.00       .93          .155             .985
    28    29    40       .93       .80          .165             .963
    29    25    37       .92       .81          .243             .859
    30    22    19      1.00      1.00          1.00             1.00
    31    19    28       .90       .93          .507             .236
    32    20    27       .80       .85          .660             .381
    33    32    43       .91       .81          .346             .946
    34    41    49      1.00       .94          .059             .995
    35    44    43      1.00       .93          .036             .995
    36    34    39       .97       .97          .902             .555
    37    22    37      1.00       .89          .095             .990
    38    32    38       .94       .95          .881             .293
    39    28    30       .86       .80          .812             .692
    40    32    51       .47       .51          .432             .334
    41    28    35       .89       .86          .916             .546
    42    35    38       .83       .76          .711             .936
    43    17    25       .77      1.00          .001             .982
    44    40    45       .98       .98          .656             .010
    45    24    32      1.00       .94          .153             .993
    46    48    64      1.00       .97          .155             .995
    47    19    28       .95       .97          .362             .018
    48    45    58       .80       .78          .660             .934
    49    22    21       .91       .95          .462             .563
    50    42    38       .76       .79          .718             .876



Table 9, cont.

                                         Improvement χ² p-values
                  Proportion Correct                    Format by
                                       Format Effect    Score Effect
Item #   N_f   N_r         f         r  (Difficulty) (Discrimination)

    51    30    45       .87       .98          .036             .288
    52    16    34      1.00       .97          .480            1.000
    53    34    36       .88       .89          .732             .865
    54    28    23       .75       .83          .519             .362
    55    29    44       .69       .50          .248             .877
    56    18    32       .94       .91          .914             .082
    57    31    29       .55       .72          .153             .582
    58    33    33       .61       .58          .429             .591
    59    49    55       .53       .55          .884             .954
    60    44    55       .84       .91          .271             .133
    61    10    18      1.00       .83          .117             .988
    62    43    51       .70       .65          .577             .369
    63    12    15       .92       .53          .027             .490
    64    34    54       .94       .80          .443             .268
    65    16    32       .75       .69          .729             .488
    66    28    25       .82       .88          .536             .791
    67    29    35       .52       .40          .430             .302
    68    33    23       .94       .87          .367             .897
    69    26    18       .85       .89          .757             .112
    70    28    53       .82       .76          .851             .944
    71    16    24       .94       .83          .315             .043
    72    20    19       .95       .90          .310             .572
    73    23    41       .87       .73          .484             .145
    74    23    18      1.00       .94          .032             .856

APPENDIX

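The 3 steps used in the stepwise logistic regression analyses are as
follows:

Step 1. Fit

$$
P(Y_{ij} = 1) = \frac{\exp\left[\beta_{0j} + \beta_{1j} X_i + \beta_{2j} M_i + \beta_{3j} M_i X_i\right]}{1 + \exp\left[\beta_{0j} + \beta_{1j} X_i + \beta_{2j} M_i + \beta_{3j} M_i X_i\right]},
$$

where

$$
Y_{ij} = \begin{cases} 1, & \text{if the } i\text{th examinee answered item } j \text{ correctly,} \\ 0, & \text{otherwise,} \end{cases}
$$

$X_i$ = the $i$th examinee's total test score minus $Y_{ij}$,

$$
M_i = \begin{cases} +1, & \text{if the } i\text{th examinee used paper-pencil,} \\ -1, & \text{if the } i\text{th examinee used computer,} \end{cases}
$$

and test $G^2(\beta_{0j}, \beta_{1j}, \beta_{2j}, \beta_{3j})$.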
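As an illustration only, the Step 1 model can be fitted for a single item with a few lines of plain Python. The sketch below is not the authors' program: the function names (`fit_logistic`, `item_analysis`, `simulate`), the Newton-Raphson fitting routine, the simulated data, and the particular sequence of nested models are all assumptions made for the example. It takes G² to be minus twice the maximized log-likelihood and treats the drop in G² from adding one term as an improvement chi-square on 1 degree of freedom.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(rows, y, iters=25):
    """Newton-Raphson fit of a logistic model; returns (coefficients, G2)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(k):
                grad[a] += (yi - p) * x[a]
                for c in range(k):
                    info[a][c] += p * (1.0 - p) * x[a] * x[c]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    g2 = 0.0  # G2 = -2 * log-likelihood at the fitted coefficients
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        g2 -= 2.0 * (yi * eta - math.log1p(math.exp(eta)))
    return beta, g2

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(max(chi2, 0.0) / 2.0))

def item_analysis(X, M, Y):
    """Assumed nested sequence: score only, + format, + format-by-score."""
    base = [[1.0, x] for x in X]
    fmt = [[1.0, x, m] for x, m in zip(X, M)]
    full = [[1.0, x, m, m * x] for x, m in zip(X, M)]  # Step 1 model
    _, g_base = fit_logistic(base, Y)
    _, g_fmt = fit_logistic(fmt, Y)
    beta, g_full = fit_logistic(full, Y)
    return beta, p_value_1df(g_base - g_fmt), p_value_1df(g_fmt - g_full)

def simulate(n=400, seed=0):
    """Hypothetical data: X is the rest score, M is the +1/-1 format code."""
    rng = random.Random(seed)
    X, M, Y = [], [], []
    for _ in range(n):
        x, m = float(rng.randint(10, 40)), float(rng.choice([1, -1]))
        p = 1.0 / (1.0 + math.exp(-(-3.0 + 0.15 * x + 0.3 * m)))
        X.append(x)
        M.append(m)
        Y.append(1 if rng.random() < p else 0)
    return X, M, Y

if __name__ == "__main__":
    beta, p_format, p_interaction = item_analysis(*simulate())
    print("coefficients:", [round(b, 3) for b in beta])
    print("format effect p-value:", round(p_format, 3))
    print("format-by-score p-value:", round(p_interaction, 3))
```

Under these assumptions, the two p-values play the same role as the "Format Effect" and "Format by Score Effect" columns of the logistic regression tables: the first reflects a shift in item difficulty between formats, the second a change in the slope on score (discrimination).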


[Plot not reproduced. Legend: Fixed Order, Random Order. Horizontal axis: Class.]

Figure 1. The 95% Confidence Interval about the Mean Total Test Score for
Both the Pencil & Paper and Computer Groups for Each GR01 Class.

Figure 2. The 95% Confidence Interval about the Mean Total Test Score for
Both the Pencil & Paper and Computer Groups for Each GR02 Class.

Figure 3. The 95% Confidence Interval about the Mean Total Test Score for
Both the Pencil & Paper and Computer Groups for Each GR03 Class.

Figure 4. The 95% Confidence Interval about the Mean Total Test Score for
Both the Pencil & Paper and Computer Groups for Each GR05 Class.

Figure 5. Cumulative Frequency Polygons of Total Test Scores for Fixed and
Random Order Presentations for Annex GR01.

Figure 6. Cumulative Frequency Polygons of Total Test Scores for Fixed and
Random Order Presentations for Annex GR02.

Figure 7. Cumulative Frequency Polygons of Total Test Scores for Fixed and
Random Order Presentations for Annex GR03.

Figure 8. Cumulative Frequency Polygons of Total Test Scores for Fixed and
Random Order Presentations for Annex GR05.